UPV-SI: Word Sense Induction using Self Term Expansion
Authors
Abstract
In this paper we report the results obtained by participating in the “Evaluating Word Sense Induction and Discrimination Systems” task of SemEval-2007. Our fully unsupervised system performed an automatic self-term expansion process by means of co-occurrence terms and then applied the unsupervised KStar clustering method. The task organizers calculated two ranking tables with different evaluation measures, each table containing two baselines and six runs submitted by different teams. We ranked third in both tables, outperforming three different baselines and the average score.
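As a rough illustration of what a self-term expansion step might look like, the sketch below builds a co-occurrence list from the corpus itself and then replaces every term with itself plus its co-occurring terms. The document-level counting, the `min_count` threshold, the function names, and the toy corpus are all assumptions made for this example; the abstract does not specify how co-occurrence is scored, and the KStar clustering step is not shown.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(docs, min_count=2):
    """Map each term to the terms it shares at least min_count documents with.

    Hypothetical scoring: raw document-level co-occurrence counts.
    """
    pair_counts = Counter()
    for tokens in docs:
        # Count each unordered term pair once per document.
        for a, b in combinations(sorted(set(tokens)), 2):
            pair_counts[(a, b)] += 1

    related = defaultdict(list)
    for (a, b), n in pair_counts.items():
        if n >= min_count:
            related[a].append(b)
            related[b].append(a)
    return {term: sorted(others) for term, others in related.items()}


def self_expand(docs, related):
    """Replace every term with itself followed by its co-occurring terms."""
    return [
        [w for t in tokens for w in [t] + related.get(t, [])]
        for tokens in docs
    ]


# Toy corpus (invented): four short contexts of the ambiguous word "bank".
docs = [
    ["bank", "money", "loan"],
    ["bank", "money", "deposit"],
    ["bank", "river", "water"],
    ["river", "water", "fish"],
]
related = build_cooccurrence(docs, min_count=2)
expanded = self_expand(docs, related)
print(expanded[0])
```

The expanded contexts would then be handed to a clustering method; any such step would need the similarity measure and parameters reported in the full paper.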
Similar resources
Word Sense Induction in the Arabic Language: A Self-Term Expansion Based Approach
The aim of the word sense induction/discrimination task of natural language processing is to discover the sense associated with each instance of a given ambiguous word. In this paper we present an approach based on clustering a self-expanded version of the original dataset in order to tackle this particular problem. The self-expansion technique substitutes every term of the original corpus wit...
Analyse et expansion des textes en question-réponse
This paper presents an original methodology for question answering. We noticed that query expansion is often incorrect because the question is poorly understood. However, good automatic understanding of an utterance depends on the length of its context, and questions are often short. This methodology proposes to analyse the documents and to construct an informative structure from the ...
Word Sense Discrimination Using Context Vector Similarity
This paper presents the application of context vector similarity for the purpose of word sense discrimination during query translation. The random indexing vector space method is used to accumulate the context vectors. Pairwise similarity between the context vectors of ambiguous terms and those of anchor terms indicated the possible correct translation of a query term. Two retrieval experiments wer...
An Evaluation of Graded Sense Disambiguation using Word Sense Induction
Word Sense Disambiguation aims to label the sense of a word that best applies in a given context. Graded word sense disambiguation relaxes the single label assumption, allowing for multiple sense labels with varying degrees of applicability. Training multi-label classifiers for such a task requires substantial amounts of annotated data, which is currently not available. We consider an alternate...
The UPV at GeoCLEF 2007
In this work we attempted to determine the relative importance of the geographical and WordNet-extracted terms with respect to the remainder of the query. Our system is based on Lucene and uses LingPipe for Named Entity recognition. Geographical terms are expanded with WordNet holonyms and synonyms and indexed separately. We checked the relative importance of the terms by boosting them with red...